Abstract



It is shown that a unique measure of volume is associated with any 
statistical ensemble, which directly quantifies the inherent spread or 
localisation of the ensemble. It is applicable whether the ensemble is 
classical or quantum, continuous or discrete, and may be derived from 
a small number of theory-independent geometric postulates. Remark- 
ably, this unique ensemble volume is proportional to the exponential 
of the ensemble entropy, and hence provides a novel geometric charac- 
terisation of the latter quantity. Applications include unified volume- 
based derivations of the Holevo and Shannon bounds in quantum and 
classical information theory, a precise geometric interpretation of ther- 
modynamic entropy for equilibrium ensembles, a geometric derivation 
of semi-classical uncertainty relations, a new means for defining clas- 
sical and quantum localization for arbitrary evolution processes, a 
geometric interpretation of relative entropy, and a new proposed def- 
inition for the spot-size of an optical beam. Advantages of ensemble 
volume over other measures of localization (root-mean-square deviation,
Renyi entropies, and inverse participation ratio) are discussed.

PACS Numbers: 03.65.Bz, 03.67.-a, 05.45-hb, 42.60.Jf 



I INTRODUCTION 

This paper has two main goals. The first is to demonstrate that for any 
ensemble, whether classical, quantum, discrete or continuous, there is essen- 
tially only one measure of the "volume" occupied by the ensemble which is 
compatible with basic geometric notions. This ensemble volume is thus a
preferred and universal choice for characterising what is variously referred to 
as the spread, dispersion, uncertainty, or localisation of an ensemble. 

Remarkably, the derived "ensemble volume" turns out to be proportional 
to the exponential of the entropy of the ensemble. A by-product of the 
first goal is thus a new universal characterisation of ensemble entropy, based 
on geometric notions. Indeed, a number of properties of ensemble entropy 
turn out to have simple geometric interpretations. The universal nature 
of the characterisation is of particular interest: the only previous context- 
independent interpretation of ensemble entropy to date (and hence applicable 
in particular to ensembles described by continuous probability distributions) 
appears to be as a somewhat vague measure of uncertainty or randomness.

The second goal is to apply "ensemble volume" to a wide range of contexts
in which ensembles appear. The applications demonstrate not only the
advantages of ensemble volume over other measures of spread, but also to 
some extent why it is that ensemble entropy makes a natural appearance 
in contexts as diverse as statistical mechanics, information theory, chaos, 
and quantum uncertainty relations. Some results have been briefly reported 
elsewhere. Here important details and extensions are given, as well as a
number of new results. 

The work reported here was originally motivated by several connections 
between volume and information. Shannon proved an upper bound on infor- 
mation transfer, via classical signals subject to quadratic energy and noise 
constraints, by considering ratios of spherical volumes in high-dimensional 
spaces 0. One can similarly obtain approximate upper bounds on infor- 
mation for quantum signals, via semi-classical arguments involving ratios 
of phase space volumes [0, which in some cases turn out to be exact. 
This raises the question of whether there is some general measure of volume 
which can be used to derive rigorous information bounds for the general case. 
This question is answered affirmatively here, and a new unified derivation of 
the classical Shannon and the quantum Holevo information bounds is given, 
based on simple volume properties. 

There are also a number of connections which have been made previously 
between volume and entropy. For example, derivations in statistical me- 
chanics typically obtain heuristic expressions for thermodynamic entropy by 
counting "microstates" in a phase-space volume of "small" thickness contain- 
ing a constant-energy surface P|. Ma in an interesting approach attempted to 
define the thermodynamic entropy of a system in classical statistical mechan- 
ics as proportional to the logarithm of a phase-space volume corresponding 
to the "region of motion" of the system ||^, although he could not rigorously 
define the latter region. A precise geometric interpretation of thermodynamic 
entropy for both classical and quantum equilibrium ensembles will be given 
here. 

Further, Leipnik introduced the exponential of the position entropy of 
a quantum system as a measure of its "volume", and favourably compared 
the associated uncertainty relations for position and momentum with the 
usual Heisenberg uncertainty relations (see also the review in and Sec. 
II. C below). Generalisations to other measures of "volume" were given by 
Zakai [§,0. It is demonstrated here that the former measure has a unique
geometrical significance, and a geometrical derivation of quantum uncertainty
relations is given based on the property that quantum states have a minimum 
ensemble volume. 

Zyczkowski [10], and more recently Mirbach and Korsch [11], have used
entropy as a measure of "localisation" for chaotic quantum and classical 
systems for various initial states. The results of the present paper show 
that this measure can be simply related to the spread of ensemble volume for 
arbitrary evolution processes, and provide support for the use of this measure 
over all other localisation measures. 

Rather than going immediately to general postulates for volume, and for- 
mal proofs of uniqueness, the following section first explores ensemble volume 
for a familiar class of ensembles: those described by one-dimensional
probability distributions. In this case the ensemble volume reduces to a "length",
which is calculated for a number of concrete examples and compared with 
other measures of uncertainty such as root-mean-square deviation. Geo- 
metric properties of this "length" and an associated quantum uncertainty 
relation are discussed. Two-dimensional joint probability distributions are 
also briefly discussed, where the ensemble volume becomes an "area" that 
is geometrically related to the "lengths" of the marginal distributions. This 
"area" motivates a new definition for the spot size of an optical beam. 

In Section III and an accompanying appendix, the derivation of the en- 
semble volume from universal geometric postulates is given. These postulates 
depend on theory-independent notions of invariance, projection onto orthog- 
onal axes, and additivity, and in particular are independent of whether the 
ensemble is classical or quantum. The bonus of a new geometrical
characterisation of ensemble entropy is discussed, and a geometrical interpretation
of relative entropy is given. 

Applications to statistical mechanics, semi-classical quantum mechanics, 
information theory, chaos and other types of dynamical evolution are given 
in Section IV. Conclusions are presented in Section V. 



II 1- AND 2-DIMENSIONAL EXAMPLES 

Before deriving the unique form of ensemble volume in Sec. III, it is useful
to first consider some of its properties and connections to other measures of
uncertainty in two familiar settings: continuous distributions on the line and
on the plane, for which "volume" reduces to the special cases of "length" and
"area" respectively. These special cases are already sufficient to exemplify
a number of general features of ensemble volume, and its advantages as a 
measure of spread. 



A Length 

Consider a 1-dimensional probability distribution p(x), corresponding to
some random variable X (e.g., position, momentum, or phase). There are
then a number of candidates for a direct measure of the "uncertainty" or
"spread" of X, the most well known being the root-mean-square (RMS)
deviation

$$\Delta X = \left[ \int dx\, x^2 p(x) - \left( \int dx\, x\, p(x) \right)^2 \right]^{1/2}. \tag{1}$$

This quantity is a "direct" measure in the sense of having the same units as
X, and has the virtues of being invariant under translations and reflections,
scaling linearly with X (ΔY = λΔX for Y = λX), and vanishing in the limit
that X has some definite value x'.



A second candidate is the inverse participation ratio [10, 12, 13]

$$\xi_X = \left[ \int dx\, p(x)^2 \right]^{-1} \tag{2}$$

(which may also be recognised as a monotonic function of the so-called "linear
entropy" 1 − ∫ dx p(x)² [14]). This quantity shares all of the above-noted
virtues of ΔX. However, it is in fact only a special case of what may be
called the "Renyi length"

$$L_{X,\alpha} = \left[ \int dx\, p(x)^{1+\alpha} \right]^{-1/\alpha} \qquad (\alpha > -1) \tag{3}$$



(named for its logarithm, a generalised entropy defined by Renyi [15]). Renyi
lengths are directly related to measures of uncertainty considered by Zakai 
for quantum systems [§, H, and use of their reciprocals as (indirect) measures 
of uncertainty has been extensively investigated in [16] (see also [17]). The
inverse participation ratio corresponds to α = 1 in Eq. (3).

The Renyi length L_{X,α} in Eq. (3) satisfies all of the above-noted properties
of ΔX (same units as X, translation/reflection invariance, scaling linearly
with X, and vanishing as p(x) approaches a delta function). Eq. (3) thus
introduces an uncountable infinity of possible candidates for a direct measure
of uncertainty! Fortunately, as will be seen in Sec. III, just one of these Renyi
lengths may be singled out uniquely over all other possible measures on
geometric grounds.

In particular, in this paper special attention will be paid to the case α → 0
in Eq. (3). The corresponding length will simply be denoted by L_X, and is
just the exponential of the usual ensemble entropy [18]:

$$L_X = L_{X,0} = \exp\left[ -\int dx\, p(x) \ln p(x) \right]. \tag{4}$$

It is a special case of the "ensemble volume" to be derived in Sec. III, and
will therefore be referred to as the ensemble length.
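
As a numerical illustration of Eqs. (3) and (4), the following minimal sketch evaluates the Renyi length and the ensemble length of a unit-variance Gaussian by simple midpoint integration. The function names, the integration range and the grid size are assumptions made for this example only; they are not part of the original analysis.

```python
import math

def gaussian(x, sigma=1.0):
    """Zero-mean Gaussian probability density."""
    return math.exp(-x * x / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def renyi_length(p, alpha, a, b):
    """Renyi length of Eq. (3): [integral of p^(1+alpha)]^(-1/alpha), alpha != 0."""
    return integrate(lambda x: p(x) ** (1 + alpha), a, b) ** (-1.0 / alpha)

def ensemble_length(p, a, b):
    """Ensemble length of Eq. (4): exp(-integral of p ln p)."""
    def integrand(x):
        px = p(x)
        return -px * math.log(px) if px > 0 else 0.0
    return math.exp(integrate(integrand, a, b))

L = ensemble_length(gaussian, -12, 12)            # expected (2 pi e)^(1/2)
ipr = renyi_length(gaussian, 1.0, -12, 12)        # alpha = 1: inverse participation ratio
near_zero = renyi_length(gaussian, 1e-3, -12, 12)  # small alpha: approaches L

print(f"ensemble length        : {L:.6f}")
print(f"Renyi length, alpha=1  : {ipr:.6f}")
print(f"Renyi length, alpha->0 : {near_zero:.6f}")
```

For the Gaussian the Renyi length has the closed form (2π)^{1/2}(1 + α)^{1/(2α)}, so the small-α value should approach (2πe)^{1/2}, which is what the last two printed numbers can be compared against.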



B Comparisons 

In Table I the RMS deviation and ensemble length are calculated for several
types of 1-dimensional distributions. As noted in the previous subsection, both
quantities are invariant under translations and scale linearly with X. Hence
they can be trivially calculated for distributions of the form p(x/a − x')/a
once they have been found for p(x) (by simply multiplying the result for the
latter case by a). The Table will be used to highlight a number of differences
between ΔX and L_X.

First, it is seen from Table I that the ensemble length exists in cases when
the RMS deviation does not (for Cauchy-Lorentz and sinc-squared distributions
in particular). It may further be shown that L_X is finite whenever ΔX
is: the well known variational property that ensemble entropy is maximised
for a fixed value of ΔX by a Gaussian distribution [19] immediately implies
from the scaling property and Table I that

$$L_X \le (2\pi e)^{1/2}\, \Delta X. \tag{5}$$

Thus the use of ensemble length as a measure of uncertainty allows a wider 
quantitative range of applicability than does RMS deviation. This permits, 
for example, the quantitative discussion of quantum uncertainty relations, 
expressed in terms of ensemble length, for cases in which the usual Heisenberg 
uncertainty relations have nothing to say (see following subsection). 
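
The bound in Eq. (5) can be checked against a few textbook distributions. The sketch below uses standard closed-form variances and differential entropies for unit-scale distributions (these are assumptions of the example, not values copied from Table I) and prints the ratio of the two sides of Eq. (5).

```python
import math

SQRT_2PI_E = math.sqrt(2 * math.pi * math.e)

# (RMS deviation, ensemble length) for unit-scale distributions, taken from
# the standard closed-form variances and differential entropies.
examples = {
    "gaussian, sigma = 1": (1.0, SQRT_2PI_E),
    "uniform on [0, 1]": (1.0 / math.sqrt(12.0), 1.0),
    "exponential, mean 1": (1.0, math.e),
    "laplace, scale 1": (math.sqrt(2.0), 2.0 * math.e),
}

# Ratio L_X / [(2 pi e)^(1/2) Delta X]; Eq. (5) says it never exceeds 1.
ratios = {name: length / (SQRT_2PI_E * rms)
          for name, (rms, length) in examples.items()}

for name, r in ratios.items():
    print(f"{name:22s} ratio = {r:.4f}")
```

Only the Gaussian saturates the bound, consistent with the variational property quoted above.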

Second, the calculations for the uniform and circular distributions, p_U
and p_C in Table I respectively, exemplify a maximality property of ensemble
length: it is maximised on a given interval by a uniform distribution on the
interval, with a maximum value equal to the length of the interval. Thus one
may write

$$L_X \le L \tag{6}$$

for a distribution confined to an interval of length L [20]. This property
reflects the intuitive notion that p(x) is most spread out or least localised
when it is flat, having no peaks where probability is concentrated. The RMS
deviation does not conform to this notion, achieving its maximum possible
value in the limit of two maximally-separated peaks (a distribution equally
concentrated on the endpoints of the interval).

Third, the calculation in Table I for the uniform and double-uniform
distributions p_U and p_2U illustrates an additivity property of ensemble length:
the ensemble length of p_2U is twice that of the two non-overlapping uniform
distributions p_U(x − a) and p_U(−x − a) which it comprises in equal mixture.
More generally, if p(x) and q(x) denote two non-overlapping distributions
of equal ensemble length L, then any mixture λp(x) + (1 − λ)q(x) of these
distributions satisfies

$$L_X \le 2L, \tag{7}$$

with the upper bound achieved for λ = 1/2. This property reflects the
intuitive notions that such a mixture is least localised (most spread out) when
it is not more concentrated in one of the non-overlapping regions than in the
other, and that for this equally-weighted case the non-overlapping lengths
simply add. In contrast, the RMS deviation of p_2U depends strongly on
the separation of the peaks, and indeed becomes infinite as this separation
increases. This example and the one above emphasise what can be directly
seen from Eq. (1): the RMS deviation is a measure of separation of the
region(s) of concentration from a particular point of the distribution (the
mean value), rather than a measure of the extent to which the distribution
is in fact concentrated.
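
The additivity property of Eq. (7) can be made concrete with a mixture of two non-overlapping uniform "boxes". The sketch below is illustrative only: the function name and the parametrisation by box width and centre are assumptions of the example. It returns the ensemble length and the RMS deviation of the mixture.

```python
import math

def two_box_mixture(lam, width, centre):
    """Return (ensemble length, RMS deviation) of lam*box(-centre) + (1-lam)*box(+centre).

    Each box is a uniform distribution of the given width; the boxes do not
    overlap provided centre >= width / 2.
    """
    if centre < width / 2:
        raise ValueError("boxes overlap")
    weights = [lam, 1.0 - lam]
    # The mixture density is piecewise constant: weight / width on each box.
    entropy = -sum(w * math.log(w / width) for w in weights if w > 0)
    mean = centre * (1.0 - 2.0 * lam)
    second_moment = width**2 / 12.0 + centre**2
    rms = math.sqrt(second_moment - mean**2)
    return math.exp(entropy), rms

for centre in (1.0, 10.0, 100.0):
    length, rms = two_box_mixture(0.5, 1.0, centre)
    print(f"separation {2 * centre:6.1f}: L_X = {length:.3f}, Delta X = {rms:.3f}")
print("unequal weights:", two_box_mixture(0.2, 1.0, 1.0)[0])
```

The ensemble length stays at twice the box width for equal weights, whatever the separation, and drops below that value for unequal weights, while the RMS deviation grows with the separation.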

Fourth, except in cases where the second moment of p(x) has some particular
physical meaning, it is difficult to assess the significance of a given
value of ΔX without some further information about the distribution. For
example, even for single-peaked distributions, the probability that X lies
within ±ΔX of the mean is highly dependent upon the nature of p(x).
In contrast, as will be seen in Sec. III, the ensemble length L_X has a unique
geometrical significance.

Finally, it is of interest to make a quantitative comparison between the
degrees to which a given distribution p(x) is concentrated in a region of length
L_X on the one hand, and of length 2ΔX on the other. To do so, it is natural
to define the maximum confidence corresponding to a given length L as

$$C(L) = \sup_{\{A:\, |A| = L\}} \int_A dx\, p(x), \tag{8}$$

where the supremum is over all measurable sets A of total length L. In 
the case of a distribution symmetric about a single peak this is achieved by 
choosing A to be the interval of length L centred on the mean value of the 
distribution. 

From Table I one can calculate the values of C(L_X) to be approximately
100%, 99%, 96%, 93%, 91% and 90% for the uniform, circular, Gaussian,
exponential, sinc-squared and Cauchy-Lorentz distributions respectively. The
corresponding values of C(2ΔX) are 58%, 61%, 68%, 86% for the first four
of the above distributions, with the value being undefined for the last two.
It is seen that for these examples C(L_X) varies over a much narrower range
than C(2ΔX), and that L_X typically corresponds to a larger confidence value
than 2ΔX.
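
Several of the quoted confidence values follow from elementary closed forms. The sketch below reproduces them for four of the distributions, assuming unit scale parameters and the standard expressions for their entropies and cumulative distributions; the circular and sinc-squared cases are omitted because they need the Table I entries, which are not reproduced here.

```python
import math

def confidence_table():
    """Return {name: (C(L_X), C(2*DeltaX) or None)} for unit-scale distributions."""
    s2pie = math.sqrt(2 * math.pi * math.e)
    table = {}
    # Uniform on an interval of unit length: L_X = 1, DeltaX = 1/sqrt(12).
    table["uniform"] = (1.0, 2 / math.sqrt(12))
    # Gaussian, sigma = 1: a centred interval of length L holds erf(L / (2 sqrt 2)).
    table["gaussian"] = (math.erf(s2pie / (2 * math.sqrt(2))),
                         math.erf(1 / math.sqrt(2)))
    # Exponential exp(-x), x >= 0: L_X = e, DeltaX = 1; the density decreases,
    # so the best set of length L is the interval [0, L].
    table["exponential"] = (1 - math.exp(-math.e), 1 - math.exp(-2))
    # Cauchy-Lorentz 1/[pi (1 + x^2)]: L_X = 4 pi, DeltaX undefined.
    table["cauchy-lorentz"] = ((2 / math.pi) * math.atan(2 * math.pi), None)
    return table

table = confidence_table()
for name, (c_len, c_rms) in table.items():
    rms_text = "undefined" if c_rms is None else f"{100 * c_rms:.0f}%"
    print(f"{name:15s} C(L_X) = {100 * c_len:.0f}%   C(2 DeltaX) = {rms_text}")
```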



C Uncertainty relations 

The relationship between ensemble length and ensemble entropy in Eq. (4)
allows the usual entropic uncertainty relation for the position and momentum
of a quantum particle [21] to be equivalently written in the geometric form

$$L_X L_P \ge \pi e \hbar, \tag{9}$$

relating the product of the ensemble lengths to a minimum area in phase
space. Bounding L_X and L_P from above via Eq. (5) then immediately yields
the well known Heisenberg uncertainty relation

$$\Delta X\, \Delta P \ge \hbar/2. \tag{10}$$

The above two inequalities are similar in form, and have the same broad 
physical significance: the particle cannot be prepared in a state for which 
both the position and momentum distributions have arbitrarily small spreads. 
However, it is seen that the latter inequality is mathematically weaker, as it
follows from the former. For example, it follows from Eq. (9) that L_P (and
hence, via Eq. (5), ΔP) becomes infinite as p(x) approaches a weighted sum
of delta functions. This cannot be concluded from Eq. (10).

Inequality (9) may be used to make quantitative evaluations regarding the
relative spreads of position and momentum in cases where the Heisenberg
inequality (10) yields no information. For example, consider a quantum
particle confined to an interval of length L, such that the position amplitude
is constant over the interval. It follows that the momentum statistics are
described by the sinc-squared distribution

$$\pi^{-1} (2\hbar/L) \left\{ \sin[pL/(2\hbar)]/p \right\}^2. \tag{11}$$

As noted in Table I the RMS deviation ΔP is not defined in this case, and
hence the Heisenberg inequality cannot be used to assess the degree to which
position and momentum are jointly localised. In contrast, using Eq. (11),
Table I and the scaling property of ensemble length, one finds

$$L_X L_P = 2\pi \exp[2(1 - C)]\, \hbar \approx 15 \hbar, \tag{12}$$

where C ≈ 0.57721566 denotes Euler's constant. Hence the particle has an
associated phase space area close to the lower bound of πeħ ≈ 9ħ in Eq. (9);
i.e., the particle is in fact in an approximate minimum uncertainty state of
position and momentum.

A similar example is the case of a particle confined to the positive x-axis,
with a position amplitude that decays exponentially with x. The position and
momentum distributions are then given by exponential and Cauchy-Lorentz
distributions of the forms p_E(x/a)/a and 2a p_CL(2ap/ħ)/ħ respectively,
implying via Table I and the scaling property that

$$L_X L_P = 2\pi e \hbar. \tag{13}$$

Hence the state is relatively well-localised in position and momentum, with
an associated phase-space area only twice that of the minimum in Eq. (9).
Again, the Heisenberg uncertainty relation Eq. (10) gives no information
about the joint localisation in this case.
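
The phase-space areas quoted in Eqs. (9), (12) and (13) are easy to evaluate numerically. The short sketch below works in units of ħ; the variable names are chosen for this example only.

```python
import math

EULER_GAMMA = 0.57721566  # Euler's constant, as quoted in the text

# All areas are expressed in units of hbar.
lower_bound = math.pi * math.e                                   # Eq. (9)
flat_amplitude = 2 * math.pi * math.exp(2 * (1 - EULER_GAMMA))   # Eq. (12)
exponential_amplitude = 2 * math.pi * math.e                     # Eq. (13)

print(f"minimum area          : {lower_bound:.3f}")
print(f"flat amplitude        : {flat_amplitude:.3f} "
      f"({flat_amplitude / lower_bound:.2f} x minimum)")
print(f"exponential amplitude : {exponential_amplitude:.3f} "
      f"({exponential_amplitude / lower_bound:.2f} x minimum)")
```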

Finally, it may be mentioned that there is an uncertainty relation relating 
the Renyi lengths of position and momentum for general a: it follows from 
Eq. (131) of [g that 

$$L_{X,\alpha}\, L_{P,\beta} \ge \pi\hbar\, [1 + 2\alpha]^{1 + 1/(2\alpha)}/(1 + \alpha) \tag{14}$$

for α > −1/2, where β = −α/(1 + 2α). For α = β = 0 the lower bound is
maximum, and the inequality reduces to Eq. (9) above.

D Area and spot size 

This section will be concluded by briefly looking at measures of spread for 
two-dimensional distributions, to highlight a further geometric property of 
ensemble length of importance in later sections. This property also holds for 
RMS deviation, but not for Renyi lengths in general. A related measure of 
spot size for optical beams is defined and briefly discussed. 

Each of the "length" measures in Eqs. (1), (3) and (4) has a natural
generalisation to a measure of "area", corresponding to the spread or uncertainty
of a 2-dimensional probability distribution p(x, y) of two random variables
X and Y:

$$\Delta A = \left[ \det\left( \langle \mathbf{x}\mathbf{x}^T \rangle - \langle \mathbf{x} \rangle \langle \mathbf{x}^T \rangle \right) \right]^{1/2}, \tag{15}$$

$$A_{XY,\alpha} = \langle p^{\alpha} \rangle^{-1/\alpha}, \tag{16}$$

$$A_{XY} = \exp[\langle -\ln p \rangle], \tag{17}$$

respectively, where x denotes the column vector (x, y), x^T its transpose, and
⟨·⟩ the average with respect to p. These areas satisfy properties analogous
to their 1-dimensional counterparts, and will be referred to as the RMS
area, Renyi area, and ensemble area respectively.

The RMS area in Eq. (15) may be recognised as the product of the RMS
deviations along the principal axes of the distribution in the xy-plane, and 



in general satisfies the inequality (Eq. (2.13.7) of p^ ) 



$$\Delta A \le \Delta X\, \Delta Y, \tag{18}$$

with equality for the case that p(x, y) factorises into two uncorrelated
distributions for X and Y.

This inequality for "area" and "length" has a simple geometric interpretation,
to be generalised in the following Section. In particular, the marginal
distributions p_1(x) and p_2(y) for X and Y are obtained by "projecting" the
joint distribution p(x, y) onto the two orthogonal x and y axes. The associated
RMS lengths ΔX and ΔY may be similarly thought of as obtained
by "projecting" the RMS area ΔA onto these axes. However, this is only
consistent with Euclidean geometry if inequality (18) holds: the product of
the two lengths obtained by projection of an area onto two orthogonal axes
can never be less than the original area.

Ensemble area and ensemble length are also consistent with this "projection"
interpretation: the well known subadditivity of entropy can be
equivalently written via Eqs. (4) and (17) as

$$A_{XY} \le L_X L_Y, \tag{19}$$



in analogy to Eq. (18). The subadditivity of entropy is thus seen to correspond
to a projection property of Euclidean geometry. One has the further
related property that if p(x, y) is uniform on a rectangular region oriented
parallel to the x and y axes, and vanishes outside this region, then equality
holds in Eq. (19) with L_X and L_Y corresponding to the lengths of the sides of
the rectangle. Thus Eq. (19) reduces in this case to the Euclidean property
area = length × breadth. In general, the Renyi areas in Eq. (16) are not
consistent with the projection property, as will be seen in Sec. III.
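
For a correlated bivariate Gaussian both projection inequalities, Eqs. (18) and (19), can be written in closed form, since the joint entropy is ln[2πe (det Σ)^{1/2}] for covariance matrix Σ. The sketch below is an illustrative check under that assumption; the function name and dictionary keys are invented for the example.

```python
import math

TWO_PI_E = 2 * math.pi * math.e

def gaussian_spreads(sigma_x, sigma_y, rho):
    """Areas and projected length products for a bivariate Gaussian.

    sigma_x, sigma_y are the marginal RMS deviations and rho the correlation
    coefficient, so that det(covariance) = (sigma_x sigma_y)^2 (1 - rho^2).
    """
    root_det = sigma_x * sigma_y * math.sqrt(1.0 - rho**2)
    return {
        "rms_area": root_det,                            # Eq. (15)
        "rms_product": sigma_x * sigma_y,                # right side of Eq. (18)
        "ensemble_area": TWO_PI_E * root_det,            # Eq. (17)
        "length_product": TWO_PI_E * sigma_x * sigma_y,  # right side of Eq. (19)
    }

for rho in (0.0, 0.5, 0.9):
    s = gaussian_spreads(1.0, 2.0, rho)
    print(f"rho = {rho:.1f}: A_XY / (L_X L_Y) = "
          f"{s['ensemble_area'] / s['length_product']:.3f}")
```

Equality holds only in the uncorrelated case, where the joint distribution factorises.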

Finally, it may be noted that Eq. (17) may be applied to physical distributions
other than probability distributions, with corresponding geometrical
advantages. For example, let P(x, y) denote the time-averaged power
distribution in some plane orthogonal to the direction of propagation of an optical
beam. One may then define the "geometric" spot size of the beam as the
ensemble area of the normalised power distribution P(x, y)/P_T, where P_T is
the integrated power over the plane:

$$A_{geom} = P_T \exp\left[ -(P_T)^{-1} \int dx\, dy\, P(x, y) \ln P(x, y) \right]. \tag{20}$$

This satisfies desirable properties such as being additive for non-overlapping 
identical beams, being invariant with respect to scaling the power up or down, 
scaling linearly with beam magnification, having a maximum value of A for 
a beam confined to an area A (attained for a uniform power distribution over 
that area), and satisfying a "projection property" analogous to Eq. (19). It
is also invariant under any transformation of coordinates which preserves
area in the usual sense (i.e., with unit Jacobian), and so to this extent is
independent of the coordinatisation of the plane. Alternative definitions
based on, for example, Eqs. (15) or (16) are geometrically less satisfying.
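
A discretised version of Eq. (20) is straightforward to evaluate for sampled beam profiles. The following is a minimal sketch, assuming the power density is sampled on a rectangular grid of equal cells; the function name, the grid layout and the test profiles are assumptions of the example rather than anything prescribed in the text.

```python
import math

def geometric_spot_size(power, cell_area):
    """Riemann-sum version of Eq. (20).

    power is a list of rows of non-negative power densities sampled on a grid
    whose cells each have area cell_area; cells with zero power contribute
    nothing to the P ln P sum.
    """
    total = sum(sum(row) for row in power) * cell_area
    p_log_p = sum(v * math.log(v) for row in power for v in row if v > 0) * cell_area
    return total * math.exp(-p_log_p / total)

# Top-hat beam: 10 x 20 illuminated cells of area 0.25 on a 30 x 30 grid.
tophat = [[5.0 if (r < 10 and c < 20) else 0.0 for c in range(30)] for r in range(30)]
area_tophat = geometric_spot_size(tophat, 0.25)

# Gaussian beam with unit width, sampled at cell midpoints on [-6, 6]^2.
n = 240
h = 12.0 / n
coords = [-6.0 + (k + 0.5) * h for k in range(n)]
gauss = [[3.0 * math.exp(-(x * x + y * y) / 2.0) for x in coords] for y in coords]
area_gauss = geometric_spot_size(gauss, h * h)

print(f"top-hat spot size  : {area_tophat:.3f}")
print(f"Gaussian spot size : {area_gauss:.3f}")
```

The top-hat result equals the illuminated area and does not depend on the power level, and the Gaussian result should be close to 2πe times the squared width.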

III ENSEMBLE VOLUME



The previous section indicates the wide range of possible measures for the 
spread of one- and two-dimensional probability distributions, and draws
attention to a number of geometric and other advantages enjoyed by the
"length" and "area" defined in Eqs. (4) and (17) respectively.

As noted in the Introduction, it has often proved useful to employ var- 
ious notions of "volume" for statistical ensembles across a wide variety of 
contexts, such as information theory, statistical mechanics, uncertainty re- 
lations, and chaotic evolution. Other contexts include Ornstein-Uhlenbeck 
diffusion and semi-classical quantum mechanics (see and Secs. IV. B and
IV.D below). This raises the question of whether there is in fact some uni- 
versal measure of "volume" for classical and quantum ensembles, which may 
be usefully employed in all of the above contexts and which is not restricted 
in application or interpretation to various special cases. 

Here it will be shown that indeed such a measure exists, which may 
be uniquely derived from a small number of theory-independent postulates 
fundamental to the concept of "volume". It generalises the ensemble length 
and ensemble area of the previous section, and will be referred to as the 
ensemble volume. It also leads to new geometric characterisations of entropy 
and relative entropy. 

A Notation 

Three generic types of ensemble will be considered here. The first is a classical 
ensemble described by a continuous probability distribution p(x) on some 
n-dimensional space X; the second is a classical ensemble described by a 
discrete probability distribution {p_i} where i ranges over some discrete set
I; and the third is a quantum ensemble described by a density operator W
on some Hilbert space H. 

Each of the above types of ensemble shares some universal features. It 
is essential to abstract a number of these features via a common notation if 
"volume" is to be discussed in a theory-independent manner. 

For example, consider the three identities

$$\int_X d^n x\, p(\mathbf{x}) = 1, \qquad \sum_{i \in I} p_i = 1, \qquad \mathrm{tr}_H[W] = 1. \tag{21}$$

Defining Γ to correspond respectively to the spaces/sets X, I and H; Tr_Γ[·]
to correspond respectively to integration over X, summation over I, and the
trace over H; and ρ to correspond respectively to the ensembles p(x), {p_i},
and W; these identities can be subsumed into the generic identity

$$\mathrm{Tr}_\Gamma[\rho] = 1. \tag{22}$$

Another universal feature is the notion of composite or joint ensembles: for
a given pair of spaces/sets Γ_1, Γ_2 of a given type one can define a composite
set/space Γ_12, which for classical and quantum ensembles corresponds to
the set product and the tensor product respectively of Γ_1 and Γ_2. Further,
if ρ describes a composite ensemble on Γ_12 one may define two projected
ensembles ρ_1, ρ_2 on Γ_1 and Γ_2 respectively, via

$$\rho_1 = \mathrm{Tr}_{\Gamma_2}[\rho], \qquad \rho_2 = \mathrm{Tr}_{\Gamma_1}[\rho]. \tag{23}$$

These projected ensembles correspond to marginal distributions and reduced 
density operators for the cases of classical and quantum ensembles respec- 
tively. 

Finally, one may define any two ensembles ρ, ρ' of the same type to be
non-overlapping if and only if

$$\mathrm{Tr}_\Gamma[\rho \rho'] = 0. \tag{24}$$

Note that in general two ensembles are non-overlapping if and only if they 
can be distinguished by measurement without error. 

B Postulates for volume 

For the three types of ensemble discussed in the previous subsection it is
useful to think of "volume" in the following ways. First, for a continuous
distribution p(x) on a space X, the volume corresponds to a direct measure
of the region of "spread" of p(x) in X. Second, for a classical discrete
distribution {p_i}, one may imagine the indices as labelling a set of boxes or
bins. In this case "volume" corresponds to the spread of the distribution
over these bins, i.e., as a continuous measure of the effective number of bins
occupied by the distribution. Third, for a quantum ensemble, the volume
may be considered as a continuous generalisation of Hilbert space dimension,
corresponding to a measure of the spread of the ensemble in Hilbert space.

Consider now a measure of volume, V(ρ), which satisfies the following
properties:

(i) Invariance Property: V(ρ) is invariant under all transformations on
Γ which preserve Tr_Γ[·] (these are represented by measure-preserving
transformations on X for continuous classical ensembles, permutations on I for
discrete classical ensembles, and unitary transformations on H for quantum
ensembles).

(ii) Cartesian Property: If ρ describes two uncorrelated ensembles ρ_1 and
ρ_2 on Γ_1 and Γ_2 respectively, then

$$V(\rho) = V(\rho_1)\, V(\rho_2) \tag{25}$$

(note ρ is the product ρ_1 ρ_2 for classical ensembles, and the tensor product
ρ_1 ⊗ ρ_2 for quantum ensembles).

(iii) Projection Property: If ρ describes an ensemble of composite systems
on Γ_12, then

$$V(\rho) \le V(\rho_1)\, V(\rho_2), \tag{26}$$

where ρ_1, ρ_2 are the projections of ρ defined in Eq. (23).

(iv) Additivity Property: An equally-weighted mixture of m non-overlapping
ensembles ρ_a, ρ_b, ..., each of equal volume V, has a total volume of mV, i.e.,

$$V(m^{-1}[\rho_a + \rho_b + \dots]) = mV. \tag{27}$$

(v) Uniformity Property: If ρ is any mixture of m non-overlapping ensembles
of equal volumes V, then

$$V(\rho) \le mV. \tag{28}$$

The above properties are essentially the same as those defined in [|^], 
where the additivity and uniformity properties were combined in the latter. 
Their geometrical significance is as follows. 

First, the invariance property (i) ensures that the volume V(ρ) is a function
of the ensemble alone, independently of a particular co-ordinatisation,
labelling, or measurement basis for Γ. Indeed, the transformations which
preserve Tr_Γ[·] are exactly those which preserve volume, or measure, on Γ
in the usual sense. For example, for a classical distribution p(x) on X the
measure of a subset S ⊂ X is given by

$$|S| = \int_S d^n x = \mathrm{Tr}_S[1]. \tag{29}$$

The invariance property then requires that the ensemble volume is invariant
under all transformations which preserve the measure of all subsets, i.e.,
those transformations with unit Jacobian. For the case of a classical phase
space such transformations include all canonical transformations, and hence
V(ρ) will be invariant under Hamiltonian evolution. One may similarly consider
the measure |S| = Tr_S[1] of subsets S ⊂ I and subspaces S ⊂ H; in
these cases the invariance property again requires that V(ρ) is invariant under
measure-preserving transformations, corresponding to permutations and
unitary transformations respectively.

Second, the Cartesian property (ii) is exactly analogous to the geometric 
property that area equals length times breadth, and more generally that the 
volume of the Cartesian product of two sets is equal to the product of the 
volume of the sets. This is illustrated in Figure 1. 

Third, the projection property (iii) is exactly analogous to the geometric 
property that a volume is less than or equal to the product of the lengths 
obtained by its projection onto orthogonal axes, and is illustrated in Figure 
2. It is a generalisation of the projection property discussed for RMS area 
and ensemble area in Sec. II. D. 

Fourth, the additivity property (iv) requires the ensemble volume to be 
additive for a uniform mixture of non-overlapping ensembles of equal volume. 
The geometric interpretation of this is self-evident: the total volume of m 
equal non-overlapping volumes is the sum of the individual volumes. 

Finally, the uniformity property (v) states that the maximum volume, 
of a mixture of non-overlapping ensembles of equal volume, is bounded by 
the sum of the component volumes. Thus, noting the additivity property, 
this maximum is achieved for a uniform mixture, i.e., one which is not more 
concentrated on one of the component ensembles than on any other. 

C Derivation 

Here the unique, universal measure of volume for ensembles is obtained. It 
may more generally be applied as a measure of spread for any positive classi- 
cal or quantum density, such as beam intensity or mass density, by calculating 
the "volume" of the corresponding normalised density. In such cases, where 
no ensemble is involved, one could alternatively label this quantity as the
"geometric dispersion".

In particular, one has the following result, first stated in and proved
in the Appendix:

Theorem: Any (continuous) measure of volume satisfying properties (i)-(v)
above has the form

$$V(\rho) = K(\Gamma)\, e^{S(\rho)}, \tag{30}$$

where S(ρ) denotes the ensemble entropy

$$S(\rho) = -\mathrm{Tr}_\Gamma[\rho \ln \rho], \tag{31}$$

and K(Γ) is a constant which may depend on Γ, and satisfies

$$K(\Gamma_{12}) = K(\Gamma_1)\, K(\Gamma_2). \tag{32}$$

The proof in the Appendix primarily relies on applying properties (i)-(v) 
to an arbitrarily large number of independent copies of a given ensemble p. 
I believe it may be possible to prove the theorem without the uniformity 
property (v), but have not been able to do so. 
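
Although the proof is deferred to the Appendix, it is easy to confirm numerically that the exponential of the entropy does satisfy properties (i)-(v) in the simplest setting. The sketch below does this for discrete classical ensembles with the normalisation constant set to 1; the particular distributions and helper names are arbitrary choices made for illustration, and the check says nothing about uniqueness.

```python
import math
from itertools import product

def volume(probs):
    """Ensemble volume exp(S) with K = 1 for a discrete distribution."""
    return math.exp(-sum(p * math.log(p) for p in probs if p > 0))

# (i) Invariance: relabelling (permuting) the bins leaves the volume unchanged.
p1 = [0.5, 0.3, 0.2]
permuted = [0.2, 0.5, 0.3]

# (ii) Cartesian: the joint distribution of two uncorrelated ensembles.
p2 = [0.7, 0.3]
joint_uncorrelated = [a * b for a, b in product(p1, p2)]

# (iii) Projection: a correlated joint distribution and its two marginals.
joint = [[0.4, 0.1], [0.1, 0.4]]
marginal_1 = [sum(row) for row in joint]
marginal_2 = [sum(col) for col in zip(*joint)]
joint_flat = [v for row in joint for v in row]

def mixture(weights, component):
    """Mixture of non-overlapping copies of component, placed on disjoint bins."""
    return [w * c for w in weights for c in component]

# (iv) Additivity and (v) Uniformity: three non-overlapping copies of q.
q = [0.6, 0.4]
uniform_mix = mixture([1 / 3, 1 / 3, 1 / 3], q)
skewed_mix = mixture([0.5, 0.3, 0.2], q)

print("invariance :", volume(p1), volume(permuted))
print("cartesian  :", volume(joint_uncorrelated), volume(p1) * volume(p2))
print("projection :", volume(joint_flat), "<=", volume(marginal_1) * volume(marginal_2))
print("additivity :", volume(uniform_mix), 3 * volume(q))
print("uniformity :", volume(skewed_mix), "<=", 3 * volume(q))
```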

The constant K(Γ) in Eq. (30) is a normalisation constant, reflecting the
notion that only relative volumes are of real interest in comparing different
ensembles. For continuous classical ensembles a natural choice is K(Γ) = 1,
so that a distribution which is uniform over a set S of measure V, and
vanishes outside S, has ensemble volume equal to V.

For discrete classical ensembles the choice $K(\Gamma) = 1$ corresponds to
measuring the ensemble volume in terms of the number of "bins" occupied by the
ensemble, with the minimum volume of 1 bin corresponding to a distribution
with $p_i = 1$ for some index $i$. However, if the distribution arises from the
discretisation of a continuous observable such as position (due to
measurement limitations, for example), then it would be natural to choose
$K(\Gamma)$ to correspond to the discretisation volume. If the index set is
finite, with $M$ labels, another possible choice for $K(\Gamma)$ is $1/M$. The
ensemble volume then measures the fraction of the total volume occupied by
the ensemble.
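
As a minimal numerical sketch of Eq. (30) for discrete classical ensembles, the following Python snippet evaluates the ensemble volume for the choices K = 1 and K = 1/M discussed above. The function name `ensemble_volume` and the example distributions are illustrative assumptions and do not appear in the original text.

```python
import math

def ensemble_volume(probs, k=1.0):
    """Ensemble volume V = K * exp(S) of a discrete distribution, S = -sum p ln p."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return k * math.exp(entropy)

# Hypothetical example distributions over M = 4 bins.
pure = [1.0, 0.0, 0.0, 0.0]          # concentrated on a single bin
uniform = [0.25, 0.25, 0.25, 0.25]   # spread evenly over all bins
skewed = [0.7, 0.1, 0.1, 0.1]        # mostly, but not entirely, in one bin

bins_pure = ensemble_volume(pure)                      # volume in units of bins
bins_uniform = ensemble_volume(uniform)
bins_skewed = ensemble_volume(skewed)
fraction_uniform = ensemble_volume(uniform, k=1 / 4)   # fraction of total volume
```

With K = 1 the pure distribution occupies one bin and the uniform distribution occupies all four, while the skewed distribution falls in between; with K = 1/M the uniform distribution occupies the whole available volume.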

For quantum ensembles the choice $K(\Gamma) = 1$ corresponds to measuring
the ensemble volume in terms of the number of Hilbert space dimensions
occupied by the ensemble, with pure states occupying the minimum possible
of 1 dimension. However, if the Hilbert space $H$ has finite dimension $M$ then
one could alternatively take $K(\Gamma) = 1/M$, corresponding to a fractional
measure of volume in analogy to the classical case. Finally, for quantum systems
with classical counterparts, such as spin-zero particles, one may choose
$K(\Gamma)$ so that in the classical limit the quantum ensemble volume reduces
to the classical ensemble volume. This is explored further in Sec. IV.B, and
used to obtain semi-classical uncertainty relations.

It should be noted that the assumption of continuity in the statement
of the theorem is necessary. For example, one may for a discrete classical
ensemble $\{p_i\}$ define the "support volume" as the number of non-zero $p_i$
values. This satisfies all of properties (i)-(v), but is not continuous. The
simplest counterexample is the discrete probability distribution
$\{1 - \epsilon, \epsilon\}$ for $\epsilon > 0$. As $\epsilon \to 0$ this
distribution continuously approaches the distribution $\{1, 0\}$, with a
support volume of 1; however, for all $\epsilon > 0$ the support volume is 2.
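
The discontinuity can be illustrated numerically. The sketch below (helper names and the chosen values of epsilon are assumptions made for illustration) compares the support volume with the entropic volume exp(S), taking K = 1, for the two-point distribution {1 - epsilon, epsilon}.

```python
import math

def ensemble_volume(probs):
    """Entropic volume exp(S) with K = 1."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(entropy)

def support_volume(probs):
    """Number of non-zero probabilities."""
    return sum(1 for p in probs if p > 0.0)

epsilons = [1e-1, 1e-3, 1e-6, 1e-9]
ensemble_vols = [ensemble_volume([1.0 - e, e]) for e in epsilons]
support_vols = [support_volume([1.0 - e, e]) for e in epsilons]
limit_support = support_volume([1.0, 0.0])   # value at epsilon = 0
```

The entropic volume decreases smoothly towards 1 as epsilon shrinks, whereas the support volume stays at 2 for every positive epsilon and only drops to 1 at epsilon = 0.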

If one defines the "RMS" volume for an n-dimensional observable x by
generalising Eq. ( [T^ ) to arbitrary dimensions it is not difficult to show
that the invariance property (restricted to linear transformations), the
Cartesian property, and the projection property are satisfied. However, it does
not satisfy the additivity and uniformity properties. Further, the "Renyi"
volumes

$$V_\alpha(\rho) = \left(\mathrm{Tr}_\Gamma[\rho^{\alpha+1}]\right)^{-1/\alpha}, \tag{33}$$

defined in analogy with the Renyi length and Renyi area in Eqs. (^ and
(16) respectively, satisfy properties (i), (ii), (iv) and (v) for all $\alpha > -1$.
However, a counterexample given by Renyi (Theorem 4 of Sec. IX.6 in |T^)
shows that the projection property is not satisfied, except for the cases
$\alpha = 0$ (corresponding to Eq. (30) with $K(\Gamma) = 1$), and $\alpha = -1$
(corresponding to the discontinuous case of "support volume" discussed above).
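
The two special cases of Eq. (33) can be checked numerically for a discrete classical ensemble. This is a sketch only: the function names and the example distribution are assumptions, and the alpha = 0 case is approached through a small non-zero alpha because the formula is singular there.

```python
import math

def renyi_volume(probs, alpha):
    """Renyi volume (sum_i p_i^(alpha+1))^(-1/alpha) for alpha != 0, with K = 1."""
    return sum(p ** (alpha + 1.0) for p in probs if p > 0.0) ** (-1.0 / alpha)

def ensemble_volume(probs):
    """Entropic volume exp(S) with K = 1."""
    return math.exp(-sum(p * math.log(p) for p in probs if p > 0.0))

probs = [0.5, 0.3, 0.2, 0.0]               # hypothetical distribution
near_zero = renyi_volume(probs, 1e-6)      # approaches exp(S) as alpha -> 0
support = renyi_volume(probs, -1.0)        # counts the non-zero p_i
entropy_volume = ensemble_volume(probs)
```

For this distribution the alpha = -1 value equals the number of occupied bins, and the small-alpha value agrees with exp(S) to within the expected first-order correction in alpha.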



D Geometric characterisation of entropy 

The appearance of the ensemble entropy in Eq. (30) as a result of geometric
postulates (i)-(v) provides a new approach to this quantity, which is moreover 
independent of whether the ensemble is classical or quantum, discrete or 
continuous. In particular, ensemble entropy may be defined (up to an additive 
constant) as the logarithm of the ensemble volume, where the latter is taken 
to be the primary quantity. The properties of ensemble entropy may thus be 
regarded as being geometric in origin. Indeed, it will be seen that its natural 
appearance in a number of physical contexts can be interpreted as following 
from its relationship to a "volume".

The geometric interpretation of ensemble entropy contrasts markedly with 
its only other context-independent interpretation as an (indirect) measure 



of "uncertainty" or "randomness" [|T^, |T6|, [T^, 0, |2^. Indeed ensemble 
volume provides a direct measure of uncertainty, which is advantageous when 
one wishes to compare the spreads of two ensembles of a given type (i.e., with 
the same F). For example, if two ensembles have entropies of 0.5 bits and 
1.5 bits respectively [Q, should one compare their ratio or their difference 



in assessing the degree to which the uncertainty of the second exceeds that of
the first? Since entropies are typically only defined up to a multiplicative
constant (see below), one might consider the ratio to be the more significant
means of comparison. However, the ensemble volume gives an unequivocal
answer: the volume of the second ensemble is twice that of the first in this
case, and hence the second ensemble has twice the spread.
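
The factor of two follows directly from V being proportional to exp(S). A short sketch, assuming both ensembles share the same K and using a hypothetical helper name:

```python
import math

def volume_ratio_from_bits(bits_first, bits_second):
    """Ratio V_second / V_first for entropies quoted in bits (S = bits * ln 2)."""
    return math.exp((bits_second - bits_first) * math.log(2.0))

ratio = volume_ratio_from_bits(0.5, 1.5)
```

Only the entropy difference enters the ratio, which is why the additive ambiguity in the entropy (equivalently, the choice of K) drops out of the comparison.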

It is interesting to briefly compare the derivation of ensemble volume from
properties (i)-(v) with existing axiomatic derivations of ensemble entropy.
Such axiomatic derivations are reviewed in [23], and are all related to the
original derivation given by Shannon [19], [26]. Unlike the theorem of the
previous section they are limited to discrete classical ensembles. Moreover,
they lead to an arbitrary multiplicative constant for entropy, whereas the
geometric approach leads to an arbitrary additive constant for entropy.

To see that the axioms used by Shannon and others are markedly dif- 
ferent from properties (i)-(v) used to derive ensemble volume, consider the 
"grouping axiom" of Shannon (see also Sec. 1.2 of |1^), which may be 
written in the notation of this paper as: 

$$S(\lambda\rho + (1-\lambda)\rho') = S(\{\lambda, 1-\lambda\}) + \lambda S(\rho) + (1-\lambda) S(\rho') \tag{34}$$

for any two non-overlapping discrete classical ensembles $\rho$, $\rho'$. Thus it
is assumed that the "randomness" $S(\cdot)$ of a mixture of non-overlapping
distributions is equal to that of the mixing distribution plus the average
randomness of the individual ensembles. This axiom, together with a continuity
assumption and a symmetry assumption equivalent to the invariance property (i),
is sufficient to derive the form $S(\rho) = -C \sum_i p_i \ln p_i$ for the entropy
of discrete classical ensembles, where $C$ is an arbitrary constant [29].



Eq. (34) does not translate into a natural axiom for ensemble volume:
replacing $S$ by $\ln V$ gives the equivalent constraint

$$V(\lambda\rho + (1-\lambda)\rho') = V(\{\lambda, 1-\lambda\})\, V(\rho)^{\lambda}\, V(\rho')^{1-\lambda}, \tag{35}$$

which has no simple geometric interpretation. Conversely, the additivity
property Eq. (27), that non-overlapping equal volumes add, translates under
$V \to \exp S$ into the "randomness" constraint

S(p/2 + pV2) = ln2 + ^, (36) 

which is not a natural property to postulate for a measure of "randomness" . 
The geometric approach to ensemble entropy given here thus differs signifi- 
cantly from former approaches (as is also apparent from comparing the proof 
in the Appendix with those in |]T^ ^ pU|). 
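
The equivalence of the grouping axiom Eq. (34) and its volume form Eq. (35) can be verified numerically for discrete ensembles on disjoint index sets. In this sketch the helper names, the mixing weight and the two component distributions are all illustrative assumptions, and K = 1 is taken so that V = exp(S).

```python
import math

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mix_disjoint(lam, rho, rho_prime):
    """Mixture lam*rho + (1-lam)*rho' of two ensembles on disjoint index sets."""
    return [lam * p for p in rho] + [(1.0 - lam) * p for p in rho_prime]

lam = 0.3
rho = [0.6, 0.4]
rho_prime = [0.2, 0.5, 0.3]
mixture = mix_disjoint(lam, rho, rho_prime)

# Eq. (34): grouping axiom for the entropy.
lhs_entropy = entropy(mixture)
rhs_entropy = (entropy([lam, 1.0 - lam])
               + lam * entropy(rho) + (1.0 - lam) * entropy(rho_prime))

# Eq. (35): the same statement rewritten for V = exp(S).
lhs_volume = math.exp(lhs_entropy)
rhs_volume = (math.exp(entropy([lam, 1.0 - lam]))
              * math.exp(entropy(rho)) ** lam
              * math.exp(entropy(rho_prime)) ** (1.0 - lam))
```

Both sides agree to rounding error, which confirms that Eq. (35) is simply Eq. (34) exponentiated, rather than an independent geometric statement.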



Finally, it is of interest to note that the concavity property of ensemble
entropy, $S(\sum_i \lambda_i \rho_i) \geq \sum_i \lambda_i S(\rho_i)$, is equivalent
to an inequality relating the volume of a mixture to the weighted geometric
mean of the volumes of its components:

$$V\Big(\sum_i \lambda_i \rho_i\Big) \geq \prod_i [V(\rho_i)]^{\lambda_i}. \tag{37}$$

This may be regarded as a generalisation of the uniformity property Eq. (28),
as it implies that uniform mixtures have the greatest volumes. Note that the
ensemble volume may itself be regarded as a weighted geometric mean (e.g.,
of the function $p(x)^{-1}$ with respect to $p(x)$ for continuous classical
ensembles; see sections 2.2 and 6.7 of [24]).
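
A quick numerical check of Eq. (37) for overlapping discrete ensembles is sketched below. The component distributions and weights are hypothetical values chosen for illustration, and K = 1 is assumed.

```python
import math

def volume(probs):
    """Entropic volume exp(S) with K = 1."""
    return math.exp(-sum(p * math.log(p) for p in probs if p > 0.0))

# Hypothetical components on a common three-element index set, with mixing weights.
components = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
weights = [0.5, 0.3, 0.2]

mixture = [sum(w * rho[i] for w, rho in zip(weights, components)) for i in range(3)]
mixture_volume = volume(mixture)
geometric_mean = math.prod(volume(rho) ** w for w, rho in zip(weights, components))
```

The mixture volume exceeds the weighted geometric mean of the component volumes, as the concavity of the entropy requires.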



E Relative entropy 

The relative entropy of two ensembles p and a may be defined in a context 
independent manner by fS^ 



$$S(\rho \,|\, \sigma) = \mathrm{Tr}_\Gamma[\rho(\ln\rho - \ln\sigma)]. \tag{38}$$

It is asymptotically related to the probability of mistaking ensemble p for 
ensemble a, as is reviewed in [^. Here it will briefly be indicated how a 
geometric interpretation of this quantity can be given. 

Consider a compact n-dimensional space $X$ which is divided up into a
set of non-overlapping bins $\{B_i\}$ (e.g., for measurement purposes). A discrete
probability distribution $\{p_i\}$ over the bins (e.g., corresponding to
measurement results) may then also be modelled by the continuous distribution
$p(x)$ on $X$ defined by

$$p(x) = p_i/V_i, \qquad x \in B_i, \tag{39}$$

where $V_i = \int_{B_i} d^n x$ denotes the measure of bin $B_i$. Thus $p(x)$ is
uniform over each bin, and its integral over bin $B_i$ is equal to $p_i$. Let
$\rho_D$ and $\rho_C$ denote the discrete and continuous ensembles corresponding
to $\{p_i\}$ and $p(x)$ respectively.

Now, as discussed earlier, the ensemble volume $V(\rho_D)$ is proportional to
the effective number of bins occupied by $\rho_D$. However, this does not indicate
the effective volume or spread of the ensemble relative to $X$, particularly in
the case of varying bin-sizes $V_i$. The latter is given by $V(\rho_C)$, which,
making the choice $K(\Gamma) = 1$, follows from Eq. ( |5^ ) as

$$V(\rho_C) = \exp\Big[-\sum_i p_i \ln(p_i/V_i)\Big]. \tag{40}$$

Note that in the case of equal bin-sizes $V_i = V$ this reduces to the bin-size $V$
multiplied by the effective number of bins occupied, $\exp S(\rho_D)$.

Finally, if $X$ has total measure $\sum_i V_i = V_X$, one may define the
"weighting" ensemble $\sigma_D$ as corresponding to the discrete probability
distribution $\{V_i/V_X\}$. Thus $\sigma_D$ describes the relative sizes or
weightings of the bins. It then follows via Eqs. (38) and (40) that

$$V(\rho_C)/V_X = e^{-S(\rho_D | \sigma_D)}. \tag{41}$$

Hence the relative entropy $S(\rho \,|\, \sigma)$ is directly related to the volume
of a discrete ensemble $\rho$ embedded in a continuous space, where $\sigma$
characterises the distribution of bin sizes of the embedding. Note that this
geometric interpretation of relative entropy allows its properties to be
understood as corresponding to ratios of volumes. For example, the volume of an
ensemble on $X$ can never be greater than $V_X$ (corresponding to a uniform
distribution on $X$). Hence the left-hand side of Eq. (41) is never greater than
unity, implying that

$$S(\rho \,|\, \sigma) \geq 0. \tag{42}$$
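
The relation between the embedded volume and the relative entropy in Eq. (41) is easy to confirm numerically. The bin sizes and probabilities below are hypothetical, and K = 1 is assumed for the continuous ensemble.

```python
import math

bin_sizes = [0.5, 1.0, 2.5]     # V_i, hypothetical bin measures
probs = [0.2, 0.5, 0.3]         # p_i, hypothetical bin probabilities
total = sum(bin_sizes)          # V_X

# Eq. (40): volume of the piecewise-uniform continuous ensemble.
volume_continuous = math.exp(
    -sum(p * math.log(p / v) for p, v in zip(probs, bin_sizes))
)

# Eq. (38) for discrete distributions, with sigma_i = V_i / V_X.
sigma = [v / total for v in bin_sizes]
relative_entropy = sum(p * math.log(p / s) for p, s in zip(probs, sigma))

ratio = volume_continuous / total   # left-hand side of Eq. (41)
```

The ratio equals exp(-S(rho_D | sigma_D)) and cannot exceed one, which is the volume-based reading of the non-negativity of relative entropy in Eq. (42).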






IV APPLICATIONS 



The results of Sec. II for ensemble length and ensemble area indicate the 
usefulness of ensemble volume as a direct measure of the spread of an en- 
semble (and of other positive densities such as optical beam power). Here 
other applications will be examined, in the contexts of statistical mechanics, 
semi-classical quantum mechanics, information theory, and quantum chaos. 
A particular result of note is a new unified proof of the classical Shannon 
information bound and the quantum Holevo information bound based on ra- 
tios of ensemble volumes. For the quantum case this proof is conceptually 
and technically far simpler than previous proofs. 

A Statistical mechanics 

First, in the statistical mechanics context, the Gibbs relation
$S_{th} = k S(\rho)$ between thermodynamic entropy and ensemble entropy for
equilibrium ensembles can be rewritten via Eq. (30) as

$$S_{th} = k \ln[V(\rho)/K(\Gamma)]. \tag{43}$$

Thus, the thermodynamic entropy is (up to an additive constant) proportional
to the logarithm of the ensemble volume.

From Eq. (43) and the third law of thermodynamics (that thermodynamic
entropy vanishes at absolute zero), it follows that one should choose
$K(\Gamma)$ to correspond to a minimum "zero-temperature" ensemble volume. For
quantum ensembles one has from Eqs. (30) and (31) that $V(\rho) = K(\Gamma)$ for
pure states, i.e., the quantum zero-temperature volume is just that of a pure
state on $\Gamma$. Similarly, for discrete classical ensembles, $K(\Gamma)$ is the
volume of the "pure" ensemble described by $\{1, 0, 0, \ldots\}$. However,
continuous classical ensembles violate the third law and $K(\Gamma)$ remains
arbitrary in this case (but see Sec. IV.B below).

The geometric expression (43) is very similar to the original Boltzmann
relation

$$S_{th} = k \ln W, \tag{44}$$

where $W$ is the number of distinct microstates or "elementary complexions"
consistent with the thermodynamic description. Indeed, from the above
discussion it follows that Eq. (43) provides a precise geometric interpretation
of the Boltzmann relation for discrete classical and quantum equilibrium
ensembles: thermodynamic entropy is proportional to the logarithm of the number
of non-overlapping zero-temperature volumes contained within the total
volume of the ensemble. Thus the Boltzmann relation and the Gibbs formula
for thermodynamic entropy become directly unified in the ensemble volume
approach, without appeals to reservoirs, microcanonical ensembles, etc.

Properties of thermodynamic entropy can be reinterpreted in terms of
geometric volume. For example, the additivity of thermodynamic entropy
for uncorrelated ensembles in thermal equilibrium follows from Eq. (43) and
the Cartesian property Eq. (25) for uncorrelated ensemble volumes. Note also
that irreversible processes correspond geometrically to those which increase
the volume of the ensemble.



B Semi-classical quantum mechanics 

Consider now a classical ensemble $\rho_C$ which is the "classical limit" of some
quantum ensemble $\rho_Q$, i.e., the physical properties of $\rho_C$ approximate
those of $\rho_Q$. Such ensembles exist, for example, for equilibrium ensembles in
the high-temperature limit and for the coherent states of a harmonic oscillator.

For the case of a spinless particle associated with a 2n-dimensional phase
space one can obtain a relationship between the constants $K(\Gamma_C)$ and
$K(\Gamma_Q)$ in Eq. (30) by requiring that the ensemble volumes $V(\rho_C)$ and
$V(\rho_Q)$ are approximately equal for such ensembles. Since these constants are
independent of the dynamics of the ensemble it suffices to choose an
equilibrium ensemble of isotropic oscillators. Equating the calculated values
of $V(\rho_C)$ and $V(\rho_Q)$ in the high-temperature limit then yields

$$K(\Gamma_Q) = h^n K(\Gamma_C), \tag{45}$$

for the volume of a pure state, where $h$ is Planck's constant. Thus the
Bohr-Sommerfeld quantization rule that a pure quantum state occupies a
phase-space volume of $h^n$ is recovered ^2



Eq. (45) can be used to derive semi-classical uncertainty relations from
geometric considerations. For two corresponding ensembles $\rho_Q$ and $\rho_C$ as
above, the position and momentum entropies $S_X$ and $S_P$ respectively must
be approximately equivalent for either ensemble. Further,

$$\exp(S_X)\exp(S_P) \geq \exp(S(\rho_C)) \tag{46}$$

holds for the classical ensemble from the projection property Eq. (26) applied
to projections onto the position and momentum axes. Eqs. (30), (45) and
(46) then yield the approximate inequality

$$S_X + S_P - S(\rho_Q) \gtrsim n \ln h \tag{47}$$

for quantum ensembles which have classical limits. I conjecture that exact
inequality in fact holds for all quantum ensembles.

Since the entropy of a quantum ensemble has a minimum value of 0
(corresponding to the existence of a minimum volume for quantum ensembles),
it follows from Eq. (47) that one has the semi-classical entropic uncertainty
relation

$$S_X + S_P \gtrsim n \ln h, \tag{48}$$

for quantum ensembles with classical limits. As per the derivation of Eq. (p!0|) 
from Eq. (|^), the corresponding semi-classical Heisenberg uncertainty rela- 
tion 

$$\Delta X \, \Delta P \gtrsim \hbar/e \tag{49}$$

then follows for the n = 1 case. Eqs. (48) and (49) are close to the exact
results for general quantum ensembles ^ (see Eqs. (^ and (jT^)). It 
is seen that geometrically they correspond to application of the projection
property Eq. (26) to the projections of a pure state of volume $h^n$ onto the
position and momentum axes (i.e., replacing $\Gamma_1$ and $\Gamma_2$ by $X$ and
$P$ in Figure 2).
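
To see how close the semi-classical relation Eq. (48) is to a saturating case, one can evaluate the entropy sum for a minimum-uncertainty Gaussian pure state in one dimension. The sketch below assumes units with hbar = 1 and uses the standard differential entropy of a Gaussian; the choice of width is arbitrary and the variable names are illustrative.

```python
import math

hbar = 1.0                      # assumed units
h = 2.0 * math.pi * hbar

def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Minimum-uncertainty Gaussian: sigma_x * sigma_p = hbar / 2.
sigma_x = 0.7
sigma_p = hbar / (2.0 * sigma_x)
entropy_sum = gaussian_entropy(sigma_x) + gaussian_entropy(sigma_p)

semiclassical_bound = math.log(h)          # right-hand side of Eq. (48) for n = 1
gap = entropy_sum - semiclassical_bound    # equals ln(e/2) for any sigma_x
```

For this state the entropy sum is ln(pi e hbar), which lies above the semi-classical value ln h by the constant ln(e/2), independent of the width chosen.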



C Information bounds 

Consider a communication channel where signals represented by ensembles
$\rho_1, \rho_2, \ldots$ are transmitted with prior probabilities
$p_1, p_2, \ldots$ respectively. The ensemble of signal states itself
corresponds to the mixture

$$\rho = \sum_i p_i \rho_i. \tag{50}$$

For classical ensembles it was shown by Shannon |T^, ^ that the average
amount of error-free data $I$ which can be obtained per transmitted signal,
measured in terms of the number of binary digits required to represent the
data, is bounded above by

$$I \leq \Big[S(\rho) - \sum_i p_i S(\rho_i)\Big]/\ln 2. \tag{51}$$

The formally equivalent bound for quantum ensembles was proved by Holevo
[34], and hence Eq. (51) may be referred to as the Shannon-Holevo
information bound.



Proofs given in the literature of Eq. (51) for the quantum case are
mathematically rather technical in nature, and quite different in character to
proofs for the classical case |3^, However, the formal equivalence of the
quantum and classical bounds suggests that a unified proof exploiting universal
features of statistical ensembles may be possible. Indeed the construction of
such a proof, based on simple volume arguments, was recently outlined in
and will be elaborated on here. A second such proof, which reduces the
general classical/quantum case to that of discrete classical noiseless channels,
will also be pointed out.

First, consider a message consisting of $L$ signals chosen from the set
$\{\rho_i\}$. Such a message may be denoted by $\rho_\alpha$, where
$\alpha = (i_1, \ldots, i_L)$ denotes the labels of the signals comprising the
message. In the limit that $L \to \infty$ the strong law of large numbers implies
that the relative frequency of signal $\rho_i$ appearing in the message approaches
$p_i$ with probability 1. It follows from the Cartesian property Eq. (25) that
the volume of the message satisfies

$$V(\rho_\alpha) \to V_{mess} = \prod_i [V(\rho_i)]^{p_i L} \tag{52}$$

as $L \to \infty$. Moreover, as will be shown below in Eq. (56), the volume
of any ensemble of such messages is bounded above by $[V(\rho)]^L$. Hence,
using the additivity property Eq. (27), the maximum possible number of
non-overlapping messages of length $L$, $N_L$, satisfies

$$N_L \leq [V(\rho)]^L / V_{mess} \tag{53}$$

as $L \to \infty$. Noting that error-free data can only be obtained from
distinguishing among a set of non-overlapping messages, and that such
messages require at most $1 + \log_2 N_L$ binary digits to record, it follows in
the limit of infinitely long messages that the average information gained per
signal, $I$, is bounded by

$$I \leq \lim_{L\to\infty} L^{-1}(1 + \log_2 N_L) \leq \log_2 \Big\{ V(\rho) \Big/ \prod_i [V(\rho_i)]^{p_i} \Big\}. \tag{54}$$

Finally, since communication based on finite message lengths cannot
transmit more data per signal than communication based on infinite lengths, the
bound holds for all signalling schemes, and Eq. (51) follows from Eqs. (30)
and (54).

The above proof of the Shannon-Holevo bound is geometrically simple,
being based on the ratio of the maximum available volume for an ensemble
of messages to the message volume (Eq. (53)). Note that the argument
cannot be used to derive similar bounds based on other invariant volume
measures, as all of the defining properties of ensemble volume are required.
However, heuristic arguments of the same type for other volume measures
can sometimes give excellent results P, |^. Note that the Shannon-Holevo
bound is in fact tight for both classical and quantum ensembles [p!9| , p^ , ^
corresponding geometrically to being able to choose a number $N_L$ of messages
arbitrarily close to the upper bound in Eq. (53) which can be distinguished
with a vanishingly small average error probability as $L \to \infty$.
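
For classical signals the entropy form of the bound, Eq. (51), and the volume-ratio form in Eq. (54) can be compared directly. The sketch below uses two hypothetical noisy binary signals with equal priors; the helper names are assumptions, and K = 1 is taken since it cancels in the ratio.

```python
import math

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def volume(probs):
    return math.exp(entropy(probs))

priors = [0.5, 0.5]
signals = [[0.9, 0.1], [0.1, 0.9]]   # hypothetical signal distributions

mixture = [sum(p * rho[j] for p, rho in zip(priors, signals)) for j in range(2)]

# Entropy form, Eq. (51), in bits per signal.
bound_entropy = (
    entropy(mixture) - sum(p * entropy(rho) for p, rho in zip(priors, signals))
) / math.log(2.0)

# Volume form, Eq. (54): log2 of V(rho) over the geometric mean of the V(rho_i).
bound_volume = math.log2(
    volume(mixture) / math.prod(volume(rho) ** p for p, rho in zip(priors, signals))
)
```

Both expressions give the same number, which for this symmetric example is one bit minus the binary entropy of the 10% error rate.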

To conclude this subsection it will be shown that the Shannon-Holevo
bound may also be proved by considering only messages of finite length, and
applying the classical noiseless coding theorem [19]. With notation as
above, suppose that one chooses a set of codewords $C$ from the set of messages
of length $L$, and that codeword $\rho_\alpha$, $\alpha \in C$, is transmitted with
probability $q(\alpha)$. Defining $N_i(\alpha)$ as the number of times signal
$\rho_i$ appears in codeword $\rho_\alpha$, and
$\bar\rho_l = \sum_{\alpha\in C} q(\alpha)\rho_{i_l}$ as the average l-th component
of the transmitted codewords, consistency requires that

$$p_i = L^{-1}\sum_{\alpha\in C} q(\alpha) N_i(\alpha), \qquad \rho = L^{-1}\sum_{l=1}^{L} \bar\rho_l. \tag{55}$$

Using the projection property Eq. (26) and Eq. (37) one then has the
inequality chain

$$V\Big(\sum_\alpha q(\alpha)\rho_\alpha\Big) \leq V(\bar\rho_1)\cdots V(\bar\rho_L) \leq \Big[V\Big(\sum_l L^{-1}\bar\rho_l\Big)\Big]^L = [V(\rho)]^L. \tag{56}$$

To obtain a bound for error-free data, it must be assumed that the
codewords are non-overlapping, so that they can be distinguished without error
by measurement. From Eq. (30) and the Cartesian property Eq. (25) one
may then calculate

$$V\Big(\sum_\alpha q(\alpha)\rho_\alpha\Big) = e^{S[q]} \prod_{\alpha\in C} [V(\rho_\alpha)]^{q(\alpha)} = e^{S[q]} \prod_{\alpha\in C}\prod_l [V(\rho_{i_l})]^{q(\alpha)}, \tag{57}$$

where $S[q]$ denotes the entropy of the discrete distribution $\{q(\alpha)\}$.
Combining this with Eqs. (55) and (56) then gives

$$S[q] \leq L S(\rho) - \sum_{\alpha\in C}\sum_l q(\alpha) S(\rho_{i_l}) = L S(\rho) - \sum_{\alpha\in C}\sum_i q(\alpha) N_i(\alpha) S(\rho_i) = L\Big[S(\rho) - \sum_i p_i S(\rho_i)\Big]. \tag{58}$$

Finally, from Shannon's classical noiseless coding theorem [19], $S[q]/\ln 2$
is the maximum information (measured in binary digits) which can be
transmitted on average per codeword, and hence Eq. (51) follows for the average
information transmitted per signal.

D Chaotic and other diffusion processes 



Zyczkowski [10] and Mirbach and Korsch [11, 12] have studied connections
between quantum and classical chaos via entropies associated with the
evolution of coherent states. Here it will be shown that this approach may be
simply interpreted in terms of ensemble volume, and considerably generalised.

Consider an ensemble $\rho_0$, classical or quantum, which evolves in time
under some dynamical process $D$ (not necessarily reversible). The ensemble
will explore some region of $\Gamma$, which may be large for standard diffusion
processes, or relatively small for integrable and dissipative systems. The
localisation of the ensemble in $\Gamma$ over time is characterised by the
time-averaged mixture

$$\bar\rho = \lim_{T\to\infty} T^{-1} \int_0^T dt\, \rho_t. \tag{59}$$

This mixture gives greatest weight to regions of $\Gamma$ where the ensemble
spends the most time. Hence its ensemble volume, $V(\bar\rho)$, is a measure of the
spread of the region explored by the ensemble as it evolves.

The localisation ratio for a given initial state and dynamical process may
now be defined as the ratio of the volumes of $\bar\rho$ and $\rho_0$, i.e.,

$$r = V(\bar\rho)/V(\rho_0) = \exp[S(\bar\rho) - S(\rho_0)]. \tag{60}$$

It thus measures the localisation of the ensemble under the evolution process,
relative to its initial spread. This ratio will be less than or equal to one if the
ensemble evolves to a fixed point, and greater than or equal to one if it diffuses
over the whole of $\Gamma$. For chaotic systems with integrable regions it will
depend strongly on the initial ensemble. The above definition is clearly natural
on geometric grounds, and the ensemble entropy appears as a consequence of
the uniqueness theorem in Eq. (30).
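
The localisation ratio of Eq. (60) can be illustrated with a discrete-time analogue, in which the time integral of Eq. (59) is replaced by an average over a finite number of steps of a stochastic map. This substitution, the function names, and the two example maps are assumptions made purely for illustration.

```python
import math

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def evolve(probs, transition):
    """One step of a stochastic map: new_j = sum_i probs_i * transition[i][j]."""
    size = len(probs)
    return [sum(probs[i] * transition[i][j] for i in range(size)) for j in range(size)]

def localisation_ratio(initial, transition, steps):
    """exp[S(time average) - S(initial)], averaging over `steps` time steps."""
    current, average = list(initial), [0.0] * len(initial)
    for _ in range(steps):
        average = [a + c / steps for a, c in zip(average, current)]
        current = evolve(current, transition)
    return math.exp(entropy(average) - entropy(initial))

start = [1.0, 0.0, 0.0, 0.0]
cyclic_shift = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]  # visits every site
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]      # fixed point

r_spreading = localisation_ratio(start, cyclic_shift, 400)
r_fixed = localisation_ratio(start, identity, 400)
```

An ensemble that visits all four sites equally has a localisation ratio of about 4, while one sitting at a fixed point has a ratio of 1, in line with the qualitative behaviour described above.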

For classical and quantum systems corresponding to the same evolution
process, it is of interest to compare localisation properties. This is easily
done for the case of initial quantum ensembles $\rho_Q$ which have corresponding
classical counterparts $\rho_C$ (such as coherent states). In this case the quantum
and classical localisation ratios $r_Q$ and $r_C$ can be calculated and compared.
Zyczkowski partially carries through this procedure in [10], where he plots
$S(\bar\rho)$ for the quantum counterpart of a classically chaotic process, where
$\rho_Q$ is chosen to range over a set of coherent states indexed by their
corresponding phase-space points. In this case $S(\bar\rho)$ is just the entropy of
the energy distribution of $\rho_Q$. Noting $S(\rho_Q) = 0$ for pure states, it
follows from Eq. (60) that this is equivalent to plotting the logarithm of the
localisation ratio, $\ln r$. However, he compares quantum localisation features
qualitatively with the classical phase space portrait, rather than
quantitatively with analogously calculated classical localisation ratios.

Mirbach and Korsch extended the approach of Zyczkowski by also
calculating $S(\bar\rho)$ for the classical ensembles $\rho_C$ corresponding to the
coherent states $\rho_Q$. For a complete family of such states they then compared
the corresponding classical and quantum values of $S(\bar\rho)$ (Figures 1 and 3 of
[12]). Since for this case $S(\rho_Q)$ and $S(\rho_C)$ are constants, this amounts
to comparing the logarithms of the classical and quantum localisation ratios
(up to an additive constant).

However, Mirbach and Korsch argue that one should in fact compare
measurement entropies rather than the direct ensemble entropies, to smear out
quantum fluctuations in the latter case [12]. This is also easily interpreted
in terms of localisation ratios. In particular, for a measurement observable
$A$ on a classical or quantum ensemble $\rho$, let $V_A(\rho)$ denote the volume of
the measurement distribution of $A$. The localisation ratio of an evolution
process with respect to $A$, for an initial ensemble $\rho_0$, is then defined in
analogy to Eq. (60) as

$$r_A = V_A(\bar\rho)/V_A(\rho_0). \tag{61}$$
Again one may compare localisation ratios for classical and quantum
ensembles, where one chooses corresponding observables $A_Q$ and $A_C$. The
logarithm of this quantity (up to an additive constant) is plotted in Figures 2
and 3 of [11] for quantum and classical systems respectively for a complete
set of coherent states, where $A_C$ is chosen to be a phase-space measurement
(so that $r_{A_C} = r_C$), and $A_Q$ to be a "Husimi" phase-space measurement
corresponding to the complete set of coherent states.



V CONCLUSIONS 

In conclusion, an essentially unique measure of volume for classical and quan- 
tum ensembles has been found, related to ensemble entropy, which provides 
a geometric tool for any context in which ensembles appear. This measure 
is universal in the sense that it may be defined by theory-independent con- 
cepts of invariance, uncorrelated ensembles, projection, and non-overlapping 
ensembles (properties (i)-(v)). 

Its properties as a direct measure of "spread" have been investigated in 
Sec. II for continuous distributions, and favourably compared with measures 
based on root-mean-square deviation. New geometric characterisations of 
ensemble entropy and relative entropy have been discussed in Secs. III.D
and III.E. 

Applications include a new definition of spot size for optical beams; a 
precise geometric interpretation of the Boltzmann relation in statistical me- 
chanics; a derivation of semi-classical uncertainty relations based on the exis- 
tence of a minimum volume for quantum states and a projection property of 
volumes; a unified derivation of results in classical and quantum information 
theory based on simple volume ratios; and a new and universal definition of 
a localisation ratio which measures the time-averaged spreading of an ensem- 
ble and underlies entropic measures previously investigated in the context of 
quantum chaos. 

Work is in progress on further applications, particularly to quantum infor- 
mation theory [^, measures of quantum entanglement [^], and information 



exclusion relations [^, The conjecture suggested following Eq. (47) is also
under active investigation, and the (mostly weaker) bound

$$S_X + S_P - S(\rho) \geq \ln 2\pi e\hbar - \ln[1 + \Delta X \Delta P/(\hbar/2)] \tag{62}$$

has thus far been found for the n = 1 case. 

APPENDIX

Here the fundamental theorem stated in Sec. III.C is proved, showing
essentially that the exponential of the ensemble entropy is the unique measure
of the volume of a statistical ensemble. It is convenient to first prove the
theorem for discrete classical ensembles, and then extend the arguments to
quantum ensembles and to continuous classical ensembles. Notation will be
as defined in Sec. III.A, and reference will be made to the five assumed
properties of the volume measure $V(\rho)$ stated in Sec. III.B.

Let $\rho$ denote a classical discrete ensemble $\{p_i\}$, with finite index set
$I = \{1, 2, \ldots, M\}$. Defining the "pure" ensemble $\rho_j$ ($j \in I$) as
corresponding to the distribution $\{p_i^{(j)}\}$ with $p_i^{(j)} = \delta_{ij}$,
one can write $\rho$ as the mixture

$$\rho = \sum_i p_i \rho_i. \tag{63}$$

Note that one has the two basic properties

$$\mathrm{Tr}_\Gamma[\rho_j \rho_k] = 0 \;\; (j \neq k), \qquad V(\rho_j) = \mathrm{constant} = V_I. \tag{64}$$

The first states that these pure ensembles are non-overlapping, and the
second that they have equal ensemble volumes (this follows from the invariance
property, noting that the $\rho_j$ map to each other under permutations).

Now consider the ensemble $\rho^L$ on $\Gamma^L$ corresponding to $L$
uncorrelated copies of $\rho$. For each $\alpha = (i_1, i_2, \ldots, i_L)$ in $I^L$
define

$$\rho_\alpha = \rho_{i_1}\rho_{i_2}\cdots\rho_{i_L}, \qquad p(\alpha) = p_{i_1}p_{i_2}\cdots p_{i_L}. \tag{65}$$

Thus $\rho_\alpha$ corresponds to the uncorrelated composite ensemble formed by
$\rho_{i_1}, \rho_{i_2}, \ldots, \rho_{i_L}$ (in that order). One can then decompose
$\rho^L$ into the mixture

$$\rho^L = \sum_{\alpha \in I^L} p(\alpha)\rho_\alpha. \tag{66}$$

The proof of the theorem proceeds by finding a suitable set of so-called
"typical sequences" $T \subset I^L$ [|l^, |2^, which allows $\rho^L$ in Eq. (66) to
be approximated by certain mixtures of the ensembles $\{\rho_\alpha\}$ where
$\alpha$ is restricted to range over $T$.

For a given $\alpha \in I^L$ let $N_i(\alpha)$ denote the number of times the index
$i$ appears as a component of $\alpha$, and let $P(\alpha) \in I^L$ correspond to a
permutation of the components of $\alpha$. If $S(\rho)$ denotes the entropy of
$\rho$ defined in Eq. (31) of the text, then for any $\epsilon > 0$ and $L$
sufficiently large one may choose a set $T$, with $|T|$ elements, which satisfies:

(T1) $C_T = \sum_{\alpha\in T} p(\alpha) \geq 1 - \epsilon$,
(T2) $|T| = e^{L[S(\rho) + \delta_L]}$,
(T3) $\sum_i |L^{-1}N_i(\alpha) - p_i| < \delta'_L$ for all $\alpha \in T$,
(T4) $\alpha \in T$ implies $P(\alpha) \in T$ for all $P$,

where both $\delta_L$ and $\delta'_L \to 0$ as $L \to \infty$. A particular example
of such a set is

$$T = \{\alpha : |L^{-1}N_i(\alpha) - p_i| < [M p_i(1-p_i)/(L\epsilon)]^{1/2}\}. \tag{67}$$

Properties (T1) and (T2) for this set are proved in Theorem 1.3.1 of [19];
property (T3) follows noting that $\sum_i [p_i(1-p_i)]^{1/2}$ is bounded by
$(M-1)^{1/2}$ and hence that one can choose $\delta'_L = M(L\epsilon)^{-1/2}$; and
property (T4) is an immediate consequence of $N_i(\alpha)$ being invariant under
permutations.
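
Properties (T1) and (T2) can be illustrated for a two-outcome ensemble, where the typical set is fixed by the number of times the first outcome appears. In the sketch below the tolerance `delta` is a fixed illustrative value rather than the L-dependent bound of Eq. (67), and all names and parameter values are assumptions.

```python
import math

def log_binomial(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def typical_set_summary(p, length, delta):
    """Return (C_T, ln|T| / L) for sequences over {p, 1-p} whose relative
    frequency of the first outcome lies within delta of p."""
    counts = [k for k in range(length + 1) if abs(k / length - p) < delta]
    log_terms = [log_binomial(length, k) for k in counts]
    # ln|T| through a log-sum-exp over the allowed counts, to avoid overflow.
    peak = max(log_terms)
    log_size = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    probability = sum(
        math.exp(lt + k * math.log(p) + (length - k) * math.log(1.0 - p))
        for lt, k in zip(log_terms, counts)
    )
    return probability, log_size / length

p, length, delta = 0.3, 2000, 0.03
entropy = -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
probability, rate = typical_set_summary(p, length, delta)
```

For these values the typical set carries almost all of the probability, while its logarithmic size per symbol differs from the entropy S by an amount of order `delta`, which is the role played by the correction in (T2).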

To obtain an upper bound for the volume $V(\rho)$ of $\rho$, consider now the
ensembles defined by the mixtures

$$\rho_L(T) = C_T^{-1}\sum_{\alpha\in T} p(\alpha)\rho_\alpha, \qquad \rho^*_L(T) = |T|^{-1}\sum_{\alpha\in T}\rho_\alpha, \tag{68}$$

where $C_T = \sum_{\alpha\in T} p(\alpha)$. From the Cartesian property and Eqs. (64)
and (65) it follows that $V(\rho_\alpha) = [V_I]^L$ is constant, and further that
the $\rho_\alpha$ are non-overlapping. Hence, from the uniformity and additivity
properties, $V(\rho_L(T)) \leq V(\rho^*_L(T)) = |T|\,[V_I]^L$. Property (T2) then
gives

$$V(\rho_L(T)) \leq [V_I]^L e^{L[S(\rho) + \delta_L]}. \tag{69}$$

Further, from property (T1) and Eqs. (66) and (68),

$$\mathrm{Tr}_{\Gamma^L}[|\rho^L - \rho_L(T)|] = \sum_{\alpha\in T}|p(\alpha) - p(\alpha)/C_T| + \sum_{\alpha\notin T} p(\alpha) = (1/C_T - 1)C_T + (1 - C_T) \leq 2\epsilon.$$

Hence $\rho^L$ can be made arbitrarily close to $\rho_L(T)$ for $L$ sufficiently
large, and so from the assumed continuity of $V(\cdot)$, and noting from the
Cartesian property that $V(\rho^L) = [V(\rho)]^L$, one has from Eq. (69) that

$$V(\rho) = \lim_{L\to\infty}[V(\rho_L(T))]^{1/L} \leq V_I e^{S(\rho)}. \tag{70}$$

Thus the exponential of the entropy is an upper bound for the ratio of the
volume of $\rho$ to the volume of a "pure" state. Note that only properties (T1)
and (T2) of $T$ were needed to obtain this result, and that the projection
property has not been used.

To obtain the converse of inequality Eq. (70), note from the projection
property that

$$V(\rho^*_L(T)) \leq \prod_{l=1}^{L} V(\bar\rho_l(T)), \tag{71}$$

where $\bar\rho_l(T)$ is the projection of $\rho^*_L(T)$ onto its l-th component,
i.e.,

$$\bar\rho_l(T) = \sum_{\alpha=(i_1,\ldots,i_L)\in T} |T|^{-1}\rho_{i_l}. \tag{72}$$

From property (T4) of $T$, $\bar\rho_l(T)$ is independent of $l$ and hence may be
denoted by $\bar\rho$. Eq. (71) then becomes $V(\rho^*_L(T)) \leq [V(\bar\rho)]^L$.
But as noted earlier, the volume $V(\rho^*_L(T))$ follows from the additivity
property as $|T|\,[V_I]^L$, and hence via property (T2) of $T$ Eq. (71) reduces to

$$V_I e^{S(\rho)+\delta_L} \leq V(\bar\rho). \tag{73}$$
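Further, from Eqs. (68) and (72)

$$\bar\rho = L^{-1}\sum_l \bar\rho_l(T) = |T|^{-1}\sum_{\alpha\in T}\sum_{i\in I} L^{-1}N_i(\alpha)\rho_i, \tag{74}$$

and hence from Eq. (63) and property (T3) of $T$

$$\mathrm{Tr}_\Gamma[|\rho - \bar\rho|] = |T|^{-1}\,\mathrm{Tr}_\Gamma\Big[\Big|\sum_{\alpha\in T}\sum_i (p_i - L^{-1}N_i(\alpha))\rho_i\Big|\Big] \leq |T|^{-1}\sum_{\alpha\in T}\sum_i |p_i - L^{-1}N_i(\alpha)| < \delta'_L.$$

Hence $\bar\rho$ can be made arbitrarily close to $\rho$ for $L$ sufficiently large,
and so, taking the limit $L \to \infty$ in Eq. (73), the assumed continuity of
$V(\cdot)$ gives

$$V_I e^{S(\rho)} \leq V(\rho). \tag{75}$$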
Eqs. (70) and (75) yield the theorem of Sec. III.B for classical discrete
ensembles with finite index sets (where $K(\Gamma)$ in the theorem is identified with the
volume $V_I$ of a pure ensemble $\{p_i = \delta_{ij}\}$ on $I$, and the relation for $K(\Gamma)$ stated there follows
immediately from the Cartesian property). The extension to ensembles with
infinite index sets is trivial by continuity. The distribution $\{p_i\}$ of such an
ensemble $\rho$ can be arbitrarily closely approximated by its (renormalised) first
$M$ terms, corresponding to a discrete ensemble $\rho_M$ with a finite index set.
Hence, from the assumed continuity of ensemble volume and Eqs. (70)
and (75), $V(\rho) = V_I \lim_{M\to\infty} \exp[S(\rho_M)]$, where $V_I$ is the volume of a "pure"
ensemble with respect to the infinite index set $I$. Thus $V(\rho)$ is as per the
theorem (but becomes infinite in the case that the limit of $S(\rho_M)$ as $M \to \infty$
does not exist).
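The truncation argument can be illustrated numerically. The sketch below uses a geometric distribution $p_i = (1-q)\,q^i$ on an infinite index set; this example and the function names are chosen here for illustration and do not appear in the proof. It compares the entropy of the renormalised first $M$ terms with the closed-form entropy of the full distribution.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(w * math.log(w) for w in p if w > 0)

def truncated_entropy(q, M):
    """Entropy of the renormalised first M terms of p_i = (1 - q) * q**i."""
    weights = [(1 - q) * q**i for i in range(M)]
    total = sum(weights)
    return entropy([w / total for w in weights])

def geometric_entropy(q):
    """Closed-form entropy of the full geometric distribution."""
    return -(math.log(1 - q) + q * math.log(q) / (1 - q))

if __name__ == "__main__":
    q = 0.5
    for M in (2, 5, 10, 20, 60):
        print(M, math.exp(truncated_entropy(q, M)))
    print("limit", math.exp(geometric_entropy(q)))
```

For $q = 1/2$ the full entropy is $2\ln 2$, so $\exp[S(\rho_M)]$ tends to 4 and the ensemble volume is $4V_I$ under the theorem.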

The extension to quantum ensembles is straightforward. Indeed, for quantum
ensembles the above analysis goes through formally unchanged, where
the expansion $\rho = \sum_i p_i \rho_i$ is now identified with an orthogonal decomposition
into pure states, and the first product in Eq. (65) is a tensor product. Thus
the $\rho_i$ and $p_i$ represent (non-overlapping) eigenstates and eigenvalues of $\rho$.
The only additional consideration is that $V_I$, the volume of an eigenstate of $\rho$,
might conceivably depend on the eigenstate basis. However, this is ruled out
by the invariance property (i): all pure states on a given Hilbert space can
be connected by unitary transformations, and hence have the same volume.

Finally, the theorem may be extended to continuous classical ensembles
as follows. Consider a classical ensemble $\rho$ described by a probability
distribution $p(x)$ on an $n$-dimensional space $X$. This space may be partitioned
into a set $\{S_i\}$ of non-overlapping sets of equal volume $V$ (i.e., $\int_{S_i} d^n x = V$
for all $i$). Define the corresponding "pure" ensembles $\rho_i$ by the associated
probability distributions $p^{(i)}(x) = 1/V$ for $x \in S_i$ and $= 0$ for $x \notin S_i$.
These pure ensembles can be mapped to each other by measure-preserving
transformations, and hence from the invariance property have equal ensemble
volumes, $V_0(V)$ say. The formal analogues of the properties assumed of the pure
ensembles in the discrete case then hold, and again the above analysis for classical
discrete ensembles goes through formally unchanged for mixtures of these pure
ensembles, i.e.,

$$ V\Big(\sum_i p_i \rho_i\Big) = V_0(V) \exp\Big(-\sum_i p_i \log p_i\Big). \tag{76} $$
Now consider the particular mixture defined by

$$ \rho_V = \sum_i p_i(V)\, \rho_i, \qquad p_i(V) = \int_{S_i} d^n x\; p(x). \tag{77} $$

Thus $\rho_V$ is a discrete approximation to $\rho$, and hence, noting that
$\int_X d^n x = \sum_i \int_{S_i} d^n x$, one has from the Mean Value Theorem that

$$ \mathrm{Tr}_\Gamma[\,|\rho - \rho_V|\,] = \sum_i \int_{S_i} d^n x\; \big|p(x) - p_i(V)/V\big| \to 0 \tag{78} $$

in the continuum limit $V \to 0$. Hence, from Eq. (76) and the assumed
continuity of ensemble volume,

$$ V(\rho) = \lim_{V\to 0} V_0(V) \exp(S_V), \tag{79} $$

where $S_V$ denotes the entropy of $\{p_i(V)\}$. But again approximating an
integral by a summation,

$$ S(\rho) = -\lim_{V\to 0} V \sum_i [p_i(V)/V] \ln[p_i(V)/V] = \lim_{V\to 0} (S_V + \ln V). \tag{80} $$

Hence Eq. (79) can be rewritten as

$$ V(\rho) = e^{S(\rho)} \lim_{V\to 0} V_0(V)/V. \tag{81} $$
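The limit in Eq. (80) can be checked numerically for a simple case. The sketch below assumes a one-dimensional unit-variance Gaussian $p(x)$ and cells of equal width $V$ on a finite interval; the interval, the function names and the choice of distribution are illustrative assumptions. It evaluates $S_V + \ln V$ and compares it with the known differential entropy $\tfrac{1}{2}\ln(2\pi e)$.

```python
import math

def gaussian_cdf(x):
    """Cumulative distribution of a zero-mean, unit-variance Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def discretised_entropy(cell, half_width=10.0):
    """Entropy S_V of the cell probabilities p_i(V) for cells of width
    `cell` covering [-half_width, half_width] (tails beyond are neglected)."""
    n_cells = int(round(2 * half_width / cell))
    s = 0.0
    for k in range(n_cells):
        a = -half_width + k * cell
        p = gaussian_cdf(a + cell) - gaussian_cdf(a)
        if p > 0:
            s -= p * math.log(p)
    return s

if __name__ == "__main__":
    target = 0.5 * math.log(2 * math.pi * math.e)
    for cell in (1.0, 0.1, 0.01):
        print(cell, discretised_entropy(cell) + math.log(cell), target)
```

As the cell width shrinks, $S_V$ itself grows without bound, while $S_V + \ln V$ settles on the continuous entropy, which is why the factor $V_0(V)/V$ appears in Eq. (81).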

Finally, to show that the limit exists in Eq. (81), note that any set $S \subset X$
of measure $\int_S d^n x = V$ can be partitioned into $m$ non-overlapping sets of
equal measure $V/m$ for any integer $m$. Moreover, a "pure" ensemble on $S$,
corresponding to a distribution which is uniform over $S$ and vanishing outside
$S$, can trivially be written as an equally-weighted mixture of analogously
defined ensembles for the members of the partition. Hence from the additivity
property one has the relation $V_0(V) = m V_0(V/m)$ for the ensemble volumes
of "pure" ensembles. Further, replacing $V$ by $nV$ in this relation for any
integer $n$ implies that $V_0(rV) = r V_0(V)$ for any rational number $r = n/m$.
This can be extended to all real $r$ from the assumed continuity of ensemble
volume, so that $V_0(V)/V = \text{constant} = K(\Gamma)$ say, and the theorem follows
via Eq. (81).
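The rational-scaling step can be written out explicitly. The following short sketch uses only the relation $V_0(V) = m V_0(V/m)$, applied twice with different substitutions:

```latex
\begin{align*}
V_0(nV) &= n\,V_0(V)
  && \text{(the relation with $V \to nV$ and $m \to n$)},\\
V_0(nV) &= m\,V_0(nV/m)
  && \text{(the relation with $V \to nV$)},\\
\Longrightarrow\quad V_0\!\left(\tfrac{n}{m}\,V\right) &= \tfrac{n}{m}\,V_0(V)
  && \text{(equating the two right-hand sides)}.
\end{align*}
```

Setting $r = n/m$ gives $V_0(rV) = r V_0(V)$ for every positive rational $r$, after which continuity extends the relation to all real $r > 0$.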
TABLE I. Examples of ensemble length and RMS deviation

Distribution     $p(x)$                                        $L_X$                    $\Delta X$

Uniform          $p_u(x) = 1,\ 0 < x < 1$                      $1$                      $1/(2\sqrt{3})$
Circular         $p_c(x) = 2(1 - x^2)^{1/2}/\pi,\ |x| < 1$     $\pi/e^{1/2}$            $1/2$
Gaussian         $p_g(x) = (2\pi)^{-1/2}\exp(-x^2/2)$          $(2\pi e)^{1/2}$         $1$
Exponential      $p_e(x) = \exp(-x),\ x > 0$                   $e$                      $1$
Sinc-squared     $p_{ss}(x) = \pi^{-1}[\sin(x)/x]^2$           $\pi e^{2(1-\gamma)}$    $\infty$
Cauchy-Lorentz   $p_{cl}(x) = \pi^{-1}/(1 + x^2)$              $4\pi$                   $\infty$
Double-uniform   $p_{du}(x) = 1/2,\ 0 < |x| - a < 1$           $2$                      $[1/3 + a(a+1)]^{1/2}$

$\gamma \approx 0.57721566$ denotes Euler's constant
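Several of the finite entries in Table I can be reproduced by direct numerical integration. The sketch below assumes that the ensemble length is $L_X = \exp[-\int p \ln p\, dx]$, as implied by the theorem with the constant set to unity, and uses a simple midpoint rule; the function names, integration ranges and grid size are illustrative choices. The heavy-tailed sinc-squared and Cauchy-Lorentz cases are not treated, since truncating their tails would require separate care.

```python
import math

def ensemble_length(pdf, lo, hi, n=100_000):
    """exp(differential entropy) of `pdf` on [lo, hi], midpoint rule."""
    h = (hi - lo) / n
    s = 0.0
    for k in range(n):
        p = pdf(lo + (k + 0.5) * h)
        if p > 0:
            s -= p * math.log(p) * h
    return math.exp(s)

def rms_deviation(pdf, lo, hi, n=100_000):
    """Root-mean-square deviation of `pdf` on [lo, hi], midpoint rule."""
    h = (hi - lo) / n
    xs = [lo + (k + 0.5) * h for k in range(n)]
    mean = sum(x * pdf(x) for x in xs) * h
    var = sum((x - mean) ** 2 * pdf(x) for x in xs) * h
    return math.sqrt(var)

def circular(x):
    return 2.0 * math.sqrt(max(0.0, 1.0 - x * x)) / math.pi

def gaussian(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def exponential(x):
    return math.exp(-x)

if __name__ == "__main__":
    print("circular   ", ensemble_length(circular, -1, 1),
          rms_deviation(circular, -1, 1))
    print("gaussian   ", ensemble_length(gaussian, -12, 12),
          rms_deviation(gaussian, -12, 12))
    print("exponential", ensemble_length(exponential, 0, 60),
          rms_deviation(exponential, 0, 60))
```

The computed values can be compared with the tabulated $\pi/e^{1/2}$, $(2\pi e)^{1/2}$ and $e$ for $L_X$, and with $1/2$, $1$ and $1$ for $\Delta X$.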
FIGURE CAPTIONS

FIG. 1. Two uncorrelated ensembles $\rho_1$ and $\rho_2$ on spaces $\Gamma_1$ and $\Gamma_2$
respectively (shown here compressed to 1-dimensional axes) have respective
volumes $V(\rho_1)$ and $V(\rho_2)$ as indicated by the darkened axis regions. The
Cartesian property states that the corresponding joint ensemble $\rho$
has a "rectangular" volume $V(\rho) = V(\rho_1)V(\rho_2)$, i.e., $V(\rho)$ corresponds to
the Cartesian product of volumes $V(\rho_1)$ and $V(\rho_2)$.

FIG. 2. An ensemble $\rho$ on the product space of $\Gamma_1$ and $\Gamma_2$ has a volume
$V(\rho)$ indicated by the solid closed curve. The corresponding projected
ensembles $\rho_1$ and $\rho_2$ on $\Gamma_1$ and $\Gamma_2$ respectively have projected volumes $V(\rho_1)$
and $V(\rho_2)$, indicated by the darkened axis regions. The projection property
states that $V(\rho)$ can be no greater than the volume of the rectangular
region formed by the dashed lines, i.e., than the product of the projected
volumes.







